class: title-slide

<br>
<br>

# Effects of Early Warning Emails on Student Performance

<br>

.padding_left.pull-down.white[
.font120[**_J. Klenke_**], T. Massing, N. Reckmann, J. Langerbein, B. Otto, M. Goedicke, C. Hanck

<br>
<br>
<br>

`\(15^{TH}\)` International Conference on Computer Supported Education Prague, 21-23 April, 2023
]

---

# Course Description

- Analyzed course: _Inferential Statistics_ at the University of Duisburg-Essen in the summer semester of 2019
- Compulsory course for several business and economics programs
- Weekly 2-hour lecture
- Weekly 2-hour exercise
- [Kahoot!](https://kahoot.com/) games were used to interact with students during classes
- Homework and 5 online tests were offered on the e-assessment platform [JACK](https://s3.paluno.uni-due.de/en/forschung/spalte1/e-learning-und-e-assessment)
- Information on __802__ individuals was collected from Moodle
- __337__ students took an exam at the end of the semester

---

# Theory

- Students' success and/or final grade can be predicted relatively early

---

# Data and Warning Mail Decision

- We used two data sources
  - The results of the first three online tests
  - The cumulative homework points in JACK
- A logit model was used to predict each student's probability of passing the final exam, based solely on the results of the first three online tests
- The model was trained on the latest data from the same course, given two years earlier
  - The lecture is held only every other year
- If the predicted probability was less than 0.4, the student was flagged for a warning
- However, as the online tests were not mandatory, we also considered students' general activity in JACK and adjusted the decision on whether a mail was sent

---

# Course Timeline Main Events

<br>
<br>

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/timeline_plot.png" alt="Timeline for the key events in the 2019 summer term course Inferential Statistics (treatment cohort). 
The shaded area indicates the period after treatment. There were 57 days between the warning email and the first opportunity to take the exam and 113 days between the warning email and the second opportunity." width="80%" />
<p class="caption">Timeline for the key events in the 2019 summer term course Inferential Statistics (treatment cohort). The shaded area indicates the period after treatment. There were 57 days between the warning email and the first opportunity to take the exam and 113 days between the warning email and the second opportunity.</p>
</div>

---

# Regression Discontinuity Design (RDD)

- The treatment is __not__ randomly assigned, so more common methods such as OLS are not suitable
- In our design, the treatment is mainly a function of the predicted probability of passing the final exam
- Consider the following RDD representation:

`$$Y_i = \beta_0 + \alpha T_i + \beta W_i + u_i$$`

.padding_left.padding_left.font70[
<ul>
<li>\(W_i\) denotes the predicted probability of passing the final exam</li>
<li>\(T_i\) indicates whether a student received a mail</li>
<ul style="list-style-type: '‣ ';" >
<li>\(T_i = 1[W_i \leq c]\), with \(c = 0.4\)</li>
</ul>
<li>\(\alpha\) denotes the treatment effect</li>
<li>\(u_i\) denotes the error term</li>
</ul>
]

- RDD compares individuals close to the cutoff to estimate the treatment effect
- This design is called _sharp_ RDD because the treatment and control groups are perfectly separated at the cutoff

--

<svg viewBox="0 0 192 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#004c93;" xmlns="http://www.w3.org/2000/svg"> <path d="M176 432c0 44.112-35.888 80-80 80s-80-35.888-80-80 35.888-80 80-80 80 35.888 80 80zM25.26 25.199l13.6 272C39.499 309.972 50.041 320 62.83 320h66.34c12.789 0 23.331-10.028 23.97-22.801l13.6-272C167.425 11.49 156.496 0 142.77 0H49.23C35.504 0 24.575 11.49 25.26 25.199z"></path></svg> `\(\;\)` This design is not suitable for our analysis because our groups are not perfectly separated

---

# Fuzzy RDD

- The _fuzzy_ design makes it possible to analyse a treatment when the two groups are not perfectly separated
- Only the probability of receiving the treatment needs to _change_ at the cutoff
- The effect is estimated by instrumental variable estimation: the first stage estimates `\(\widehat{T}_i\)`, which is then inserted into the second stage
- First Stage:

`$$T_i = \gamma_0 +\gamma_1 Z_i + \gamma_2 W_i + \nu_i \qquad \quad$$`

- Second Stage:

`$$Y_i = \beta_0 + \alpha \widehat{T}_i + \delta_1 W_i + \beta X_i + u_i$$`

---

# RDD Graphic

---

# Model Assumption

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/test_cont_label.png" alt="your caption" width="80%" />
<p class="caption">your caption</p>
</div>

---

# Results

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#plots/model_plot_label.png" alt="your caption" width="80%" />
<p class="caption">your caption</p>
</div>

<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px; font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
<tr>
<th class="tg-0pky"></th>
<th class="tg-c3ow">Model 1</th>
<th class="tg-c3ow">Model 2</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tg-0pky">LATE</td>
<td class="tg-c3ow">0.193<br>(4.889)</td>
<td class="tg-c3ow">0.146<br>(4.852)</td>
</tr>
<tr>
<td class="tg-0pky">Bandwidth</td>
<td class="tg-c3ow">0.255</td>
<td class="tg-c3ow">0.255</td>
</tr>
<tr>
<td class="tg-0pky">F-statistic</td>
<td 
class="tg-c3ow">0.257</td>
<td class="tg-c3ow">0.257</td>
</tr>
<tr>
<td class="tg-0pky">N</td>
<td class="tg-c3ow">126</td>
<td class="tg-c3ow">126</td>
</tr>
</tbody>
</table>

---

# Sources
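
---

# Fuzzy RDD: Two-Stage Sketch

As a rough illustration of the two-stage estimation on the Fuzzy RDD slide, the following Python sketch runs both stages by hand on synthetic data. It is not the authors' implementation, and several things in it are assumptions made only for illustration: the instrument is taken to be `Z_i = 1` if `W_i <= c` and `0` otherwise, the covariates `X_i` are left out, the running variable is centred at the cutoff, and the data (including the effect of 2 and the treatment rates on each side) are invented. Only the cutoff 0.4 and the bandwidth 0.255 come from the slides. The outcome is noise-free by construction so that the estimate reproduces the assumed effect; real data would not behave this cleanly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Return OLS coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def fuzzy_rdd(w, t, y, cutoff=0.4, bandwidth=0.255):
    """Two-stage least squares near the cutoff; returns (alpha, first-stage jump)."""
    keep = [i for i, wi in enumerate(w) if abs(wi - cutoff) <= bandwidth]
    z = [1.0 if w[i] <= cutoff else 0.0 for i in keep]  # assumed instrument Z_i
    wc = [w[i] - cutoff for i in keep]                   # centred running variable
    # First stage: T_i = gamma_0 + gamma_1 Z_i + gamma_2 W_i + nu_i
    x1 = [[1.0, zi, wi] for zi, wi in zip(z, wc)]
    gamma = ols(x1, [t[i] for i in keep])
    t_hat = [sum(g * x for g, x in zip(gamma, row)) for row in x1]
    # Second stage: Y_i = beta_0 + alpha T_hat_i + delta_1 W_i + u_i (X_i omitted)
    x2 = [[1.0, th, wi] for th, wi in zip(t_hat, wc)]
    beta = ols(x2, [y[i] for i in keep])
    return beta[1], gamma[1]


# Synthetic, fully deterministic data (hypothetical, not the course data).
n = 400
w = [(i + 0.5) / n for i in range(n)]  # predicted pass probabilities on a grid
# Fuzzy assignment: about 80% treated at or below the cutoff, about 10% above it.
t = [
    (0.0 if i % 5 == 0 else 1.0) if wi <= 0.4 else (1.0 if i % 10 == 0 else 0.0)
    for i, wi in enumerate(w)
]
true_effect = 2.0  # invented treatment effect
y = [10.0 + true_effect * ti + 5.0 * wi for ti, wi in zip(t, w)]

alpha, jump = fuzzy_rdd(w, t, y)
print(f"estimated effect: {alpha:.3f}, first-stage jump: {jump:.3f}")
```

The second value returned is the estimated jump in treatment probability at the cutoff. If it were close to zero, the instrument would be weak and the second-stage estimate unreliable. In practice a dedicated RDD package with kernel weighting and robust standard errors would be preferable to this hand-rolled version.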